Uncertainty and sampling error.

Authors

  • Douglas G Altman
  • J Martin Bland
Abstract

Medical research is conducted to help reduce uncertainty. For example, randomised controlled trials aim to answer questions relating to treatment choices for a particular group of patients. Rarely, however, does a single study remove uncertainty. There are two reasons for this: sampling error and other (non-sampling) sources of uncertainty. The word “error” comes from a Latin root meaning “to wander,” and we use it in its statistical sense of variation from the average, not “mistake.”

Sampling error arises because any sample may not behave quite the same as the larger population from which it was drawn. Non-sampling error arises from the many ways a research study may deviate from addressing the question that the researcher wants to answer. Sampling error is very much the concern of the statistician, who imagines that the group of people in the study is just one of the many possible samples from the population of interest.

Despite being widely condemned, the dominant way of summarising the evidence from a research study is the P value. It should be obvious that the evidence from a research study cannot reasonably be summarised as just a single number, but the use of P values remains unshakeable. Further, the practice of labelling P values as significant or not significant leads not only to dichotomous decisions but often also to the belief that the research question has been answered.

P values represent the probability that the observed data (or a more extreme result) could have arisen when the true effect of interest (for example, the true treatment effect in a randomised trial) is zero. It is common to interpret P<0.05 (“significant”) as clear evidence that there is a real effect, and P>0.05 (“not significant”) as evidence that there is no effect. However, the former interpretation may be unwise, and the latter is wrong. Although 0.05 is the conventional decision point, P<0.05 is far from representing certainty.
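The meaning of the 0.05 threshold can be illustrated with a small simulation. The sketch below is not from the note: the function names, the number of simulated trials, the arm size, and the use of a z test with a known standard deviation are all assumptions chosen for illustration. It simulates two-arm trials in which the true difference is exactly zero and counts how often P<0.05 occurs anyway.

```python
import math
import random


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic z."""
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null_trials(n_trials=2000, n_per_arm=50, sd=1.0, seed=1):
    """Simulate trials with NO true difference between arms.

    Returns the fraction of simulated trials with P < 0.05.
    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    # Standard error of the difference in means when the SD is known.
    se = sd * math.sqrt(2 / n_per_arm)
    significant = 0
    for _ in range(n_trials):
        arm_a = [rng.gauss(0, sd) for _ in range(n_per_arm)]
        arm_b = [rng.gauss(0, sd) for _ in range(n_per_arm)]
        diff = sum(arm_a) / n_per_arm - sum(arm_b) / n_per_arm
        if two_sided_p(diff / se) < 0.05:
            significant += 1
    return significant / n_trials


fraction = simulate_null_trials()
print(f"Fraction of null trials with P < 0.05: {fraction:.3f}")
```

Under these assumptions the printed fraction should be close to 0.05, although it varies with the random seed: a "significant" result turns up in roughly one trial in 20 even though no real effect exists.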
At P=0.05, about one in 20 studies would show a difference at least as large as the one observed if there were really no difference in the population. “Not significant” indicates that we found insufficient evidence to conclude that there is a real effect, not that we have shown that there is not one. Referring to results as statistically significant, or not, only helps a bit. Interpretation of a study’s results should be primarily based on the estimated effect and a measure of its uncertainty.

In mainstream statistics, the uncertainty of estimates is indicated by the use of confidence intervals. Before the mid-1980s, confidence intervals were rarely seen in clinical research articles. Around 1986 things changed, and these days almost all clinical research articles in major journals include confidence intervals. The confidence interval is a range of uncertainty around the estimate of interest, such as the treatment effect in a controlled trial.

For example, in a study of the impact of a mental health worker on the management of depression in primary care, it was reported that “After adjustment for baseline depression, mean depression score was 1.33 PHQ-9 points lower (95% confidence interval 0.35 to 2.31, P=0.009) in participants receiving collaborative care than in those receiving usual care at four months.” This means that we estimate that, in the population which these trial participants represent, the mean depression score if all were offered collaborative care would be between 0.35 and 2.31 scale points lower than if all were treated in the usual way. It is only an estimate. For 2.5% of studies the confidence interval will be entirely below the true population difference, and for 2.5% it will be entirely above it. We don’t think “P=0.009” adds much to this, but researchers can seldom bear to do without it.

The inevitable uncertainty from sampling error can be reduced by increasing the sample size, but usually only modestly.
To halve the width of the confidence interval we would need to quadruple the sample size.

A common mistake is to believe that the confidence interval expresses all the uncertainty. Rather, the confidence interval expresses uncertainty from just one cause, namely the uncertainty due to having taken a sample from the population defined by the inclusion criteria. Often there are other sources of uncertainty that may be even more important to consider, in particular relating to possibly biased results. We address these in our linked statistics note.
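The sample-size arithmetic above can be sketched numerically. The sketch assumes the usual idealisation that the standard error, and hence the confidence interval width, shrinks in proportion to one over the square root of the sample size, with the variability of the measurements unchanged. The function name `scaled_ci_width` is hypothetical, and the scaled widths are illustrative projections from the interval quoted above (0.35 to 2.31), not results from any study.

```python
import math


def scaled_ci_width(width, size_factor):
    """Expected CI width if the sample were size_factor times larger.

    Assumes width is proportional to 1 / sqrt(n) and that the
    variability of the measurements stays the same (an idealisation).
    """
    return width / math.sqrt(size_factor)


# Width of the 95% confidence interval quoted in the text (PHQ-9 points).
reported_width = 2.31 - 0.35

for factor in (1, 2, 4):
    projected = scaled_ci_width(reported_width, factor)
    print(f"{factor}x sample size: projected width {projected:.2f} points")
```

Doubling the sample size narrows the interval only by a factor of about 1.4, and four times as many participants are needed to halve it, which is why the gain from larger samples is usually modest.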

Similar resources

A Bootstrap Interval Robust Data Envelopment Analysis for Estimate Efficiency and Ranking Hospitals

Data envelopment analysis (DEA) is one of the non-parametric methods for evaluating the efficiency of each unit. Limited resources in the healthcare economy are the main reason for measuring the efficiency of hospitals. In this study, a bootstrap interval data envelopment analysis (BIRDEA) is proposed for measuring the efficiency of hospitals affiliated with the Hamedan University of Medical Sciences. The propos...

Error and Uncertainty Quantification and Sensitivity Analysis in Mechanics Computational Models

Multiple sources of errors and uncertainty arise in mechanics computational models and contribute to the uncertainty in the final model prediction. This paper develops a systematic error quantification methodology for computational models. Some types of errors are deterministic, and some are stochastic. Appropriate procedures are developed to either correct the model prediction for deterministi...

Partial inspection problem with double sampling designs in multi-stage systems considering cost uncertainty

The nature of input materials is changed as long as the product reaches the consumer in many types of manufacturing processes. In designing and improving multi-stage systems, the study of the steps separately may not lead to the greatest possible improvement in the whole system, therefore the study of inputs and outputs of each stage can be effective in improving the output quality characterist...

Quantifying errors without random sampling

BACKGROUND All quantifications of mortality, morbidity, and other health measures involve numerous sources of error. The routine quantification of random sampling error makes it easy to forget that other sources of error can and should be quantified. When a quantification does not involve sampling, error is almost never quantified and results are often reported in ways that dramatically oversta...

Heterogenic Solid Biofuel Sampling Methodology and Uncertainty Associated with Prompt Analysis

Accurate determination of the properties of biomass is of particular interest in studies on biomass combustion or cofiring. The aim of this paper is to develop a methodology for prompt analysis of heterogeneous solid fuels with an acceptable degree of accuracy. Special care must be taken with the sampling procedure to achieve an acceptable degree of error and low statistical uncertainty. A samp...

Combined uncertainty factor for sampling and analysis

Measurement uncertainty that arises from primary sampling can be expressed as an uncertainty factor, which recognises its sometimes approximately log-normal probability distribution. By contrast, uncertainty arising from chemical analysis is usually expressed as relative uncertainty, based upon the assumptions of its approximately normal distribution. A new method is proposed that enables uncer...

Journal:
  • BMJ

Volume 349

Publication date: 2014